I.  INTRODUCTION


A mathematical model fitting problem arises when one compares real
observations with theoretical predictions. The observations always
contain observational inaccuracies and, likewise, the theory behind the
prediction can be inadequate. If discrepancies between observations and
predictions are unacceptably large for a particular situation, then one
is faced with the task of adjusting, in a rational manner, either the
observations, or the theory, or both, so that an acceptable mathematical
description of the event can be established. The problem can be
subdivided conveniently into three subtasks, each of which requires a
different approach and background information.

First, one has to choose a model. Normally, this requires supporting
information  from  engineering,  physics,  geometry,  etc.,  which  may  suggest 
or  postulate  a  reasonable  mathematical  description  of  the  observable 
event.  We  shall  assume  in  this  report  that  the  model  is  formulated  as 
a  system  of  equations  containing  observations  and,  possibly,  also  some 
undetermined  model  parameters. 

Once  the  model  is  selected,  one  can  compare  predicted  values  of 
observable  quantities  with  corresponding  observations.  The  comparison 
provides  the  basis  for  a  rational  adjustment  of  the  observations  and/or 
of the model. This subtask of the problem is a purely mathematical part
of model fitting and it belongs to the category of ill-posed problems.
Its mathematical/numerical treatment is independent of the other two
subtasks, i.e., of applications. We shall be concerned with this part
of the problem in the present article.

After  the  adjustments  have  been  carried  out,  one  has  to  validate 
the  mathematical  model,  unless  it  has  been  prescribed,  e.g.,  by  the 
geometry of the event. The validation typically, but not necessarily,
involves a statistical analysis of the discrepancies between
observations and predictions. The result of the validation process may
be a new formulation of model equations and subsequent fitting, i.e., a
repetition of the whole task until some validation criterion is satisfied.
We  shall  not  discuss  this  part  of  the  problem,  noticing  only  that  the 
results  of  the  second  subtask  provide  the  data  basis  necessary  for  a 
validation. 

If the model equations are not linear, then the model fitting problem
generally leads to systems of complicated simultaneous equations, and
corresponding numerical difficulties may arise. Often the numerical
treatment can be simplified by a reformulation of the model equations,
particularly by the introduction of new variables through variable
transformations. Such manipulations have been suggested in textbooks
and are routinely used in applications. Examples of recently published
applications where variable transformations have been used are
References 8, 9, and 10.

A  closer  investigation  of  variable  transformations  in  model  fitting 
problems  suggests  that  the  formulations  should  be  used  more  cautiously 
than  some  of  the  texts  suggest.  Therefore,  we  shall,  in  this  report, 
present  an  investigation  of  some  consequences  of  the  transformations 
and  draw  conclusions  about  their  usefulness  for  the  simplification  of 
the  numerical  treatment  of  model  fitting  problems. 

In  Section  II  we  shall  formulate  the  mathematical  model  fitting 
problem  in  general  terms  and  discuss  the  effects  that  can  be  anticipated 
from  manipulations  of  model  equations.  In  Section  III  we  shall  specialize 
the  considerations  to  nonlinear  least  squares  problems  and  produce 
explicit  formulas  that  are  needed  in  such  problems.  Some  examples  will 
be  presented  in  Section  IV,  and  Section  V  will  summarize  the  conclusions 
that  can  be  drawn  from  the  theoretical  discussions  and  from  examples. 


II.  GENERAL  ASPECTS  OF  MATHEMATICAL  MODEL  FITTING 


Let  the  model  equations  be 


$$ A(X)\,\theta = 0 , \tag{1} $$

where $X \in R^n$ is the vector of all observations, $\theta \in R^p$ is a
model parameter vector, and $A(X)$ is an operator that operates on
$\theta$ and has a range $R^r$. We assume that the following relations
hold between the dimensions n, r, and p:

$$ n \ge r > p \ge 0 . \tag{2} $$

By permitting the dimension p to be zero, we also include in our
considerations cases in which the model equations do not contain free
parameters. Then Eq. (1) reduces to A(X) = 0.

Typical for applications are cases in which the r Eqs. (1) for $\theta$
are independent and, because of Eq. (2), do not have a solution. Then
one replaces the model equations by another system

$$ \tilde{A}(X)\,\theta = 0 , \tag{3} $$

choosing the operator $\tilde{A}(X)$ such that it approximates $A(X)$ and
such that Eq. (3) has a solution. The determination of $\tilde{A}(X)$ can
be considered as the central part of the model fitting problem.

In order to have a measure for the approximation, we introduce a
metric for the operators. Let $\rho[\tilde{A}(X), A(X)]$ be a metric. Then
one can formulate the mathematical model fitting as the following
constrained minimization problem:

$$ \tilde{A}(X)\,\theta = 0 , \qquad W\{\rho[\tilde{A}(X), A(X)]\} = \min. \tag{4} $$

where $W\{\rho\}$ is a generally convex object function. The choice of the
metric $\rho$ and of the object function $W\{\rho\}$ determines the type of
the model fitting, e.g., least squares, maximum norm, etc.

We shall now discuss the selection of an approximate operator
$\tilde{A}(X)$. First, we notice that the model operators $A(X)$ and
$\tilde{A}(X)$ are generally needed and defined only within a finite
neighborhood of the observations X. Therefore, assumptions about
properties of the operators need to be made for that neighborhood only.
Let the neighborhood consist of all points Z = X + C, whereby C is
restricted component-wise by

$$ \gamma_i < C_i < \Gamma_i , \qquad i = 1, 2, \ldots, n . \tag{5} $$

The intervals $(\gamma_i, \Gamma_i)$ normally contain zero, but exceptions
are possible and do occur in applications. Second, we assume that within
the neighborhood (5) A(Z) is a continuous function of Z. Then a
reasonable choice of $\tilde{A}(X)$ is

$$ \tilde{A}(X) = A(X + C) . \tag{6} $$

The choice achieves a natural parametrization of the approximation.
The approximation parameter is the vector $C \in R^n$ and the operator
$\tilde{A}(X)$ depends continuously on the parameter within the
restrictions (5).

The parametrized model fitting problem can be formulated as follows:

$$ A(X + C)\,\theta = 0 , \qquad W\{\rho[A(X + C), A(X)]\} = \min. \tag{7} $$

The quantities to be determined by Eq. (7) are the approximation parameter
C and the model parameter $\theta$. We assume that the solution vector C is
within the limits specified by Eq. (5).

We will need in the sequel some differentiability properties for
the model operator. As far as X is concerned, we assume the properties
to hold within the neighborhood (5). With respect to $\theta$ we assume
that a similar neighborhood exists in the vicinity of the solution of
Eq. (7) in which $A(X)\theta$ is a continuous function of $\theta$. The
differentiability assumptions are that $A(X + C)\theta$ is twice
differentiable with respect to all its n + p arguments within the
cartesian product space of the neighborhoods of X and $\theta$. We also
assume that within that space


$$ \operatorname{rank} \frac{\partial A}{\partial X} = r \tag{8} $$


and  define 
$$ \rho[A(Z), A(X)] = \|Z - X\| . \tag{9} $$

$\rho$ is a metric within the neighborhood in which Eq. (8) holds. We also
assume that the model equations do not contain redundant parameters.
The assumption may be expressed as the requirement

$$ \operatorname{rank} \frac{\partial A(X)\theta}{\partial \theta} = p . \tag{10} $$


With the specialization Eq. (9), the model fitting problem becomes

$$ A(X + C)\,\theta = 0 , \qquad W\{\rho[A(X + C), A(X)]\} = W\{\|C\|\} = \min. \tag{11} $$


Eq. (11) is an abstract formulation of common model fitting problems.
The difference C between the observations X and the "corrected
observations" X + C is called the residual vector. In the formulation
(11), we require that a norm of the residual vector be minimized, subject
to model equations which have to be satisfied at X + C. The model
parameter vector $\theta$ is not essential in this formulation. The number
of model parameters may be zero and it is normally orders of magnitude
smaller than the number of approximation parameters, i.e., residuals. The
determination of $\theta$ can be, of course, in some applications more
important than the determination of C, but this is not always the case.

A least squares model fitting problem is a special case of Eq. (11),
characterized by a particular choice of the norm in the definition
Eq. (9), and of the object function $W\{\rho\}$. The least squares metric is

$$ \rho[A(Z), A(X)] = \|Z - X\| = [(Z - X)^T R^{-1} (Z - X)]^{1/2} , \tag{12} $$

where R is an estimate of the variance-covariance matrix of the
observations. The least squares object function is

$$ W\{\rho\} = \rho^2 . \tag{13} $$
Therefore, the least squares model fitting problem is defined by

$$ A(X + c)\,t = 0 , \qquad W = \|c\|^2 = c^T R^{-1} c = \min. \tag{14} $$

In Eq. (14) we have used c and t instead of C and $\theta$, respectively,
thus indicating the least squares values of both parameter vectors.

The use of $R^{-1}$ as a norm matrix in the definition (12) makes the
norm $\|c\|$ and W dimensionless, which is very convenient when fitting
results are compared. If the variance-covariance matrix R is known
exactly, and the observational errors are normally distributed, then the
solution of Eq. (14) is a maximum likelihood solution of the approximation
problem (Reference 11). The same maximum likelihood solution is obtained
if R approximates the variance-covariance matrix up to an unknown factor.
In applications one has to be content with an estimate of R. Then, often,
the off-diagonal elements are assumed to be zero as a matter of course.
Because the results of the model fitting depend on R, such assumptions
should not be made without having reasons to believe that zero is a
better approximation than a non-zero value. The theoretical treatment is
not complicated by the assumption that R is not diagonal, nor are the
numerical complications insurmountable. Realistic estimates of R are,
however, important for the interpretation of the results, and for the
validation of the fitting.
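
The remark about the unknown factor can be checked in one line. The
following short derivation is an added illustration; the scalar
$\alpha > 0$ is an assumption introduced here and does not appear in the
original text.

```latex
% Assumption: R is replaced by \alpha R, with an unknown scalar \alpha > 0.
\[
  W_\alpha = c^T (\alpha R)^{-1} c = \frac{1}{\alpha}\, c^T R^{-1} c
           = \frac{1}{\alpha}\, W .
\]
```

The model equations do not involve R, so $W_\alpha$ and W are minimized by
the same c and t; only the value of the minimum is scaled.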


We solve the optimization problem (11) or (14) using the Lagrange
multiplier technique, and call the multipliers correlates, as usual in
adjustment problems. Let $K \in R^r$ be a correlate vector and let the
modified object function be

$$ \tilde{W} = \tfrac{1}{2} W\{\|C\|\} - K^T A(X + C)\,\theta . \tag{15} $$


Necessary conditions for the solution of the optimization problem are
obtained by setting equal to zero the partial derivatives of the modified
object function with respect to the unknowns C, $\theta$, and K. This
yields the following set of normal equations.


11. H.I. Britt and R.H. Luecke, "The Estimation of Parameters in Nonlinear,
Implicit Models," Technometrics, Vol. 15, pp. 233-247, 1973.
$$ \frac{1}{2} \frac{\partial}{\partial C} W\{\|C\|\} - \frac{\partial}{\partial C} [K^T A(X + C)\,\theta] = 0 , \tag{16a} $$

$$ \frac{\partial}{\partial \theta} [K^T A(X + C)\,\theta] = 0 , \tag{16b} $$

$$ A(X + C)\,\theta = 0 . \tag{16c} $$


The solution of the model fitting problem (11) is among the solutions
of Eqs. (16). On the other hand, one cannot guarantee that a particular
solution of the normal equations corresponds to the absolute minimum
solution of Eq. (11), nor is the uniqueness of the solution given. An
investigation of these complications is not the subject of this paper.
Mostly, such problems can be, and are, taken care of by ad hoc measures
based on background information from the application. Therefore, we
simplify our present theoretical discussion by assuming in this section
that a numerical solution of Eqs. (16) can be obtained, and that it has
been verified as the absolute minimum solution of Eq. (11).
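
As an added illustration (not part of the original derivation), consider
the simplest least squares case of the normal equations: no model
parameters (p = 0) and a single linear model equation (r = 1). The vector
a and the scalar d are hypothetical names introduced only for this sketch.

```latex
% Assumed model (p = 0, r = 1): A(X) = a^T X - d = 0.
% Least squares object function: W = c^T R^{-1} c, scalar correlate k.
\begin{align*}
  R^{-1} c - k\,a &= 0 , \\
  a^T (X + c) - d &= 0 .
\end{align*}
% The parameter equation is absent because there are no parameters.
% Solving the first equation for c and substituting into the second:
\[
  c = k\,R\,a , \qquad k = \frac{d - a^T X}{a^T R\, a} .
\]
```

Because these normal equations are linear, and R is positive definite, the
solution is unique and is the absolute minimum. The complications
mentioned above arise only for nonlinear model equations.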

In least squares problems, the first term $\partial W/\partial C$ in
Eq. (16a) is linear with respect to C. Nonlinear expressions which could
possibly be simplified by algebraic manipulations may occur in the second
term in Eq. (16a), and in Eqs. (16b) and (16c). The structure of these
terms strongly depends on the form in which the model Eqs. (16c) are cast,
and it is obvious that simplifications can be achieved by proper
formulations. In particular, one does not have to insist that each model
equation be solved for a "dependent" observation. Such a form is assumed
in most textbooks on data reduction and postulated in computer programs
for data reduction problems. Quite often an implicit formulation of the
Eqs. (16c) can be simpler, also producing simpler expressions for the
derivatives in Eqs. (16a) and (16b). The solution of the problem (11) is,
of course, independent of the particular form in which the model equations
are cast. This remark is trivial in the present context, and it is a
consequence of the formulation of the model fitting problem by Eq. (11).
Reference 12 reports numerous unsuccessful attempts to achieve a
similar invariance statement when the problem was formulated differently.

The aforementioned manipulations of the model equations can also
include nonlinear transformations of the parameter $\theta$. Such
transformations do not affect the definition of the metric $\rho$, because
the metric of the operator is independent of the operand. Therefore, the
transformations do not affect the first term in Eq. (16a) either, and are
a powerful tool for the simplification of the rest of the equations. An
example in which nonlinear parameter transformations are used to
linearize the model equations is reported in Reference 9. In Section IV
we shall give other examples.

12. P.A.D. DeMaine, "Automatic Curve Fitting, I. Test Methods," Computers
and Chemistry, Vol. 2, pp. 1-6, 1978.

The formal procedure of replacing parameters is as follows.
Suppose that one wants to replace the parameter $\theta$ by $\sigma$,
whereby both parameters are related by a nonsingular function

$$ \theta = g(\sigma) . \tag{17} $$

(Regularity of the transformation needs to be assumed only within a
neighborhood of the solution.) Let the model equations in terms of
$\sigma$ be

$$ \bar{A}(X)\,\sigma = 0 . \tag{18} $$

The operator $\bar{A}$ can always be obtained from A by the definition

$$ \bar{A}(X)\,\sigma = A(X)\,g(\sigma) ; \tag{19} $$

however, often one can find other equivalent formulations that are
simpler. The metric $\rho$ associated with $\bar{A}$ is defined as in
Eq. (9):

$$ \rho[\bar{A}(Z), \bar{A}(X)] = \|Z - X\| . \tag{20} $$

With this definition and the same object function $W\{\rho\}$ as before one
obtains the normal equations


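$$ \frac{1}{2} \frac{\partial}{\partial \bar{C}} W\{\|\bar{C}\|\} - \frac{\partial}{\partial \bar{C}} [\bar{K}^T \bar{A}(X + \bar{C})\,\sigma] = 0 , \tag{21a} $$

$$ \frac{\partial}{\partial \sigma} [\bar{K}^T \bar{A}(X + \bar{C})\,\sigma] = 0 , \tag{21b} $$

$$ \bar{A}(X + \bar{C})\,\sigma = 0 . \tag{21c} $$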
The solution vectors of Eqs. (16) and Eqs. (21) are related by

$$ \bar{C} = C , \qquad \theta = g(\sigma) . \tag{22} $$

The vectors K and $\bar{K}$ can be computed from these values using
formulas given in the next section.

The  relation  (22)  is  again  a  simple  consequence  of  the  formulation 
(11)  of  the  model  fitting  problem.  Benda^  proves  the  correspondence 
(22)  for  a  particular  transformation  and  application  and  indicates  that 
previous  developers  of  software  for  such  problems  were  not  aware  of  the 
relation. 

If the solution of the model fitting task has been found from
Eq. (21) in terms of $\sigma$, but the parameter vector $\theta$ is of
interest, then one also needs a formula for the accuracy of $\theta$. Let
us assume that the solution algorithm for Eq. (21) has provided
information about the accuracy of $\sigma$ in the form of an estimate
$V_\sigma$ of the variance-covariance matrix of the components of
$\sigma$. (In Section III, we shall give formulas for $V_\sigma$ in least
squares problems.) Then an estimate of the variance-covariance matrix
$V_\theta$ of the components of $\theta$ can be obtained by applying the
linearized law of variance propagation to the relation (22). The result is

$$ V_\theta = \frac{\partial g}{\partial \sigma}\, V_\sigma \left( \frac{\partial g}{\partial \sigma} \right)^T . \tag{23} $$
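
The linearized variance propagation described above can be sketched
numerically. The following is a minimal illustration only: the
transformation g, the numerical values, and the helper name
`propagate_variance` are assumptions made for this example and are not
taken from the report.

```python
import math

def propagate_variance(jacobian, v_sigma):
    """Return J * V_sigma * J^T for matrices given as lists of rows."""
    rows, inner = len(jacobian), len(v_sigma)
    # First product: J * V_sigma
    jv = [[sum(jacobian[i][m] * v_sigma[m][j] for m in range(inner))
           for j in range(inner)] for i in range(rows)]
    # Second product: (J * V_sigma) * J^T
    return [[sum(jv[i][m] * jacobian[j][m] for m in range(inner))
             for j in range(rows)] for i in range(rows)]

# Hypothetical transformation theta = g(sigma) = (exp(sigma_1), sigma_2)
sigma = [math.log(2.0), 0.5]
v_sigma = [[0.01, 0.002],
           [0.002, 0.04]]
# Jacobian dg/dsigma evaluated at the solution sigma
jac = [[math.exp(sigma[0]), 0.0],
       [0.0, 1.0]]
v_theta = propagate_variance(jac, v_sigma)
print(v_theta)
```

The result is only a first order estimate: it is reliable when g is nearly
linear over the range covered by the uncertainty of the transformed
parameter.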


More complicated are the consequences of manipulations of the
model equations that involve transformations of the observations. This
is so because the transformations now affect the definition of the
metric $\rho$. Next, we shall consider such transformations.

Let  a  transformation  of  observations  be 


Y  =  v(X) 


(24) 


with  the  inverse 


X  =  u(Y)  . 
We assume that the transformation is regular within the neighborhood (5),
including the solution X + C, and that the function u(Y) is twice
differentiable there. The model Eqs. (1) are replaced by equivalent
(usually simpler) equations

$$ \hat{A}(Y)\,\theta = 0 . \tag{25} $$

The operator $\hat{A}(Y)$ can be obtained, e.g., by the definition

$$ \hat{A}(Y)\,\theta = A(u(Y))\,\theta , \tag{26} $$

but, as in the case of parameter transformations, usually other
equivalent formulations can be found that are simpler.

When we formulate the model fitting problem in terms of Y, we have
to keep in mind that the goal is to minimize the distance $\|C\|$ between
the actual observations X and their corrected values X + C. In least
squares problems, only such a minimization yields, under certain
conditions, a maximum likelihood solution. Then the minimization
problem (11) is

$$ Y = v(X) , \qquad \hat{A}(Y + B)\,\theta = 0 , \qquad W\{\|u(Y + B) - X\|\} = \min. \tag{27} $$

The normal equations for the problem (27) are

$$ \frac{1}{2} \frac{\partial}{\partial B} W\{\|u(Y + B) - X\|\} - \frac{\partial}{\partial B} [K^T \hat{A}(Y + B)\,\theta] = 0 , \tag{28a} $$

$$ \frac{\partial}{\partial \theta} [K^T \hat{A}(Y + B)\,\theta] = 0 , \tag{28b} $$

$$ \hat{A}(Y + B)\,\theta = 0 . \tag{28c} $$
The first term in Eq. (28a) is not linear with respect to the unknown B
unless the transformation (24) is linear. Therefore, a nonlinear
transformation that produces an operator $\hat{A}(Y)$ which is simpler
than the original operator A(X) introduces nonlinear terms in Eq. (28a).
The new nonlinearities may offset the advantages gained by a
simplification of the other terms in the equations.
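
A scalar example, added here for illustration, shows the effect. It
assumes a single observation X with variance R, the least squares object
function, and the hypothetical transformation Y = ln X, so that
u(Y) = exp(Y). The constants a and d in the comparison case are likewise
assumptions of this sketch.

```latex
% Assumed: n = 1, W = [u(Y + B) - X]^2 / R, u(Y) = e^{Y}.
\[
  \frac{1}{2} \frac{\partial W}{\partial B}
    = \frac{1}{R}\, e^{Y + B} \left( e^{Y + B} - X \right) ,
\]
% which is nonlinear in B. For comparison, a linear transformation
% u(Y) = a Y + d (so that X = a Y + d) gives
\[
  \frac{1}{2} \frac{\partial W}{\partial B}
    = \frac{1}{R}\, a \left( a (Y + B) + d - X \right)
    = \frac{a^2}{R}\, B ,
\]
% which is linear in B.
```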

We  shall  pursue  this  point  further  in  the  next  section  and  show 
in  detail  how  the  normal  equations  and  algorithms  are  affected  by 
transformations  of  observations  specifically  in  least  squares  problems. 


III.  LEAST  SQUARES  MODEL  FITTING 

We  consider  in  this  section  the  effects  of  variable  transformations 
on  least  squares  model  fitting  problems.  We  shall  first  derive  the 
basic  equations  for  nonlinear  least  squares  problems  in  terms  of  the 
original  observations,  and  then  show  how  the  equations  are  affected 
by  a  transformation  of  the  observations.  We  simplify  our  notation  by 
defining a vector function $F(X, \theta)$ by

$$ F(X, \theta) = A(X)\,\theta . \tag{29} $$

Then the model Eq. (1) is

$$ F(X, \theta) = 0 , \tag{30} $$

and the least squares model fitting problem (14) is

$$ F(X + c, t) = 0 , \qquad W = \|c\|^2 = c^T R^{-1} c = \min. \tag{31} $$


In the sequel we will use subscripts to denote derivatives. Also,
because derivatives of F(X + c, t) with respect to c are identical to
derivatives with respect to X, we shall use the subscript X for both.
Thus, e.g.,

$$ F_X(X + c, t) = F_c(X + c, t) = \frac{\partial}{\partial X} F(X + c, t) $$

and

$$ [k^T F(X + c, t)]_{Xt} = [k^T F(X + c, t)]_{ct} = \frac{\partial^2}{\partial X\,\partial t} [k^T F(X + c, t)] $$

are matrices with the dimensions r×n and n×p, respectively.

Using this notation, the normal equations corresponding to the
problem (31) are

$$ R^{-1} c - [k^T F_X(X + c, t)]^T = 0 , \tag{32a} $$

$$ k^T F_t(X + c, t) = 0 , \tag{32b} $$

$$ F(X + c, t) = 0 . \tag{32c} $$

The normal equations are in general nonlinear with respect to c
and t. Therefore, their numerical solution will require some kind of
iteration. We obtain second order iteration equations for Eqs. (32) by
expanding the normal equations at an approximation to the solution and
keeping the linear terms of the expansion. Let the approximation to the
solution be C, K, and T, and let the corresponding corrections be
$\varepsilon$, $\kappa$, and $\tau$. Then the expansion yields the
following Newton equations for the corrections:

$$ [I - R (K^T F)_{XX}]\,\varepsilon - R F_X^T (K + \kappa) - R (K^T F)_{Xt}\,\tau = -C , \tag{33a} $$

$$ (K^T F)_{tX}\,\varepsilon + F_t^T (K + \kappa) + (K^T F)_{tt}\,\tau = 0 , \tag{33b} $$

$$ F_X\,\varepsilon + F_t\,\tau = -F . \tag{33c} $$

The arguments of F and its derivatives in Eqs. (33) are X + C and T.
Newton-Raphson iteration equations can be established by suitable
manipulations of Eqs. (33) (see, e.g., References 13, 14, and 15). A set
of such iteration equations is given in the Appendix. Most authors
simplify Eqs. (33) by neglecting all terms that contain second order
derivatives (References 1, 11, 16, and 17). This yields so-called
Gauss-Newton procedures that theoretically have only linear
convergence and that may also have other peculiarities.
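
The difference between the two kinds of iteration can be seen in a very
small example. The sketch below is an added illustration under stated
assumptions: it treats the hypothetical model F(x) = x1^2 + x2^2 - 1 = 0
(a point fitted to the unit circle) with no model parameters (p = 0), one
model equation (r = 1), and R = I, so that the Newton equations reduce to
a 2-vector correction and a scalar correlate. The function name, the data
point, and the starting values are invented for the example; the iteration
equations of the Appendix are not reproduced here.

```python
import math

def fit_point_to_unit_circle(X, C0, use_second_order, tol=1e-12, max_iter=200):
    """Iterate the Newton equations for F(x) = x1^2 + x2^2 - 1, p = 0, R = I.

    Returns the residual vector C and the number of iterations used.
    """
    C = list(C0)   # current approximation of the residual vector
    K = 0.0        # current approximation of the (scalar) correlate
    for iteration in range(1, max_iter + 1):
        z = [X[0] + C[0], X[1] + C[1]]
        F = z[0] ** 2 + z[1] ** 2 - 1.0
        FX = [2.0 * z[0], 2.0 * z[1]]          # first derivatives F_X
        # Here (K F)_XX = 2 K I; the Gauss-Newton variant neglects this term.
        d = 1.0 - 2.0 * K if use_second_order else 1.0
        FXC = FX[0] * C[0] + FX[1] * C[1]
        FXFX = FX[0] ** 2 + FX[1] ** 2
        # Eliminate the correction eps to obtain the new correlate K + kappa
        K_new = (FXC - d * F) / FXFX
        eps = [(FX[i] * K_new - C[i]) / d for i in range(2)]
        C = [C[i] + eps[i] for i in range(2)]
        K = K_new
        if math.hypot(eps[0], eps[1]) < tol:
            return C, iteration
    return C, max_iter

X = [1.3, 0.4]            # hypothetical observed point
C0 = [-0.3, 0.3]          # deliberately poor initial residuals
C_newton, n_newton = fit_point_to_unit_circle(X, C0, use_second_order=True)
C_gauss, n_gauss = fit_point_to_unit_circle(X, C0, use_second_order=False)
print(n_newton, n_gauss)
```

Both variants should arrive at the same residual vector, which moves the
point radially onto the circle, but the variant without the second order
term needs more iterations. In this particular geometry its convergence
slows further as the observed point moves away from the circle.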

The final step in a model fitting problem is to obtain variance
estimates of the solution in terms of the estimated variances of the
observations. We shall restrict ourselves in this article to the
estimation of the accuracies of the least squares value t of the parameter
vector, and show how the estimation formulas change due to transformations
of observables. We shall use the linearized variance propagation formula
for the estimates. Estimates of the accuracies of the corrected
observations x = X + c can be obtained by analogous processes.

The formulas can be derived from the linear terms of an expansion
of the normal Eqs. (32) at the solution. Let dx, dk, and dt be the
differentials of the solution vectors x = X + c, k, and t, respectively.
Then the expansion yields


$$ [I - R (k^T F)_{XX}]\,dx - R F_X^T\,dk - R (k^T F)_{Xt}\,dt = dX , \tag{34a} $$

$$ (k^T F)_{tX}\,dx + F_t^T\,dk + (k^T F)_{tt}\,dt = 0 , \tag{34b} $$

$$ F_X\,dx + F_t\,dt = 0 . \tag{34c} $$

The arguments of F and its derivatives in Eqs. (34) are x and t.
By manipulations of Eqs. (34) that can be done in various ways
(References 13 and 18), one obtains linear relations between dt and dX,
and between dx and dX, respectively. Let the former relation be

$$ N\,dt = S\,dX . \tag{35} $$

(Explicit formulas for N and S are given in the Appendix.) Then the
estimated variance-covariance matrix $V_t$ of the parameter vector t is

$$ V_t = N^{-1} S R S^T (N^{-1})^T . \tag{36} $$


It is obvious from the derivation of Eq. (36) that $V_t$, which itself
is  only  a  linearized  approximation,  depends  on  second  order  derivatives 
of  F.  (The  formulas  in  the  Appendix  show  explicitly  this  dependency.) 
Neglect  of  the  second  order  derivative  terms  renders  a  formula  that  is 
theoretically  less  than  first  order  accurate.  Therefore,  such  a  neglect 
has  to  be  justified  in  each  application  by  providing  estimates  of  the 
magnitudes  of  the  neglected  terms.  Of  the  cited  references,  complete 
first  order  formulas  are  used  only  in  References  13,  14,  15,  and  18. 
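
For orientation, the following added example evaluates the variance
formula for a linear model, in which all second order derivatives vanish.
The matrix G (n×p, full column rank) is a hypothetical design matrix
introduced for this sketch, and the N and S shown are one possible choice
that satisfies N dt = S dX; they need not coincide with the forms given in
the Appendix.

```latex
% Assumed linear model with r = n: F(X, \theta) = X - G\theta = 0,
% so that F_X = I, F_t = -G, and all second derivatives are zero.
% The differential relations then reduce to
\begin{align*}
  dx - R\,dk &= dX , \\
  -G^T dk &= 0 , \\
  dx - G\,dt &= 0 .
\end{align*}
% Eliminating dx and dk gives
\[
  (G^T R^{-1} G)\,dt = G^T R^{-1}\,dX ,
  \qquad N = G^T R^{-1} G , \quad S = G^T R^{-1} ,
\]
% and therefore
\[
  V_t = N^{-1} S R S^T (N^{-1})^T = (G^T R^{-1} G)^{-1} .
\]
```

This is the familiar covariance matrix of weighted linear least squares.
For nonlinear models the second order derivative terms add corrections to
N and S, which is the point made in the paragraph above.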

Next,  we  introduce  variable  transformations  into  the  least  squares 
model  fitting  problem.  We  can  restrict  ourselves  to  transformations 
of  observations  because,  as  shown  in  Section  II,  transformations  of 
model  parameters  have  the  same  effects  as  simple  algebraic  manipulations 
of  the  model  equations. 

Let,  as  in  Section  II,  the  transformation  be  given  by 


Y  =  v(X) 


(37) 


with  the  inverse 


X  =  u  (Y)  . 


In  terms  of  Y,  the  least  squares  model  fitting  problem  is  defined  by 


Aivars Celmiņš, "Least Squares Adjustment with Finite Residuals for
Non-Linear Constraints and Partially Correlated Data," Ballistic
Research Laboratory Report BRL-R-1658, 1973. (AD #766283)
$$ Y = v(X) , \tag{38a} $$

$$ H(Y + b, t) = 0 , \tag{38b} $$

$$ W = \|u(Y + b) - X\|^2 = [u(Y + b) - X]^T R^{-1} [u(Y + b) - X] = \min. \tag{38c} $$

Eq.  (38b)  is  a  model  equation,  equivalent  to  Eq.  (30)  and  expressed  in 
terms  of  Y. 

The  normal  equations  for  the  problem  (38)  are 

$$ [u_Y(Y + b)]^T R^{-1} [u(Y + b) - X] - [k^T H_Y(Y + b, t)]^T = 0 , \tag{39a} $$

$$ k^T H_t(Y + b, t) = 0 , \tag{39b} $$

$$ H(Y + b, t) = 0 . \tag{39c} $$

Corresponding Newton equations for corrections $\beta$, $\kappa$, and
$\tau$ of approximate solutions B, K, and T, respectively, are

$$ [I - Q\,\Xi]\,\beta - Q H_Y^T (K + \kappa) - Q (K^T H)_{Yt}\,\tau = -\Delta , \tag{40a} $$

$$ (K^T H)_{tY}\,\beta + H_t^T (K + \kappa) + (K^T H)_{tt}\,\tau = 0 , \tag{40b} $$

$$ H_Y\,\beta + H_t\,\tau = -H , \tag{40c} $$

where

$$ Q = v_X R\, v_X^T = (u_Y)^{-1} R\, (u_Y^T)^{-1} , \tag{41} $$

$$ \Delta = v_X \cdot [u(Y + B) - X] = v_X \cdot C = (u_Y)^{-1} \cdot C , \tag{42} $$

$$ \Xi = (K^T H)_{YY} - (u^T R^{-1} C)_{YY} . \tag{43} $$

The arguments of the functions H and u in Eqs. (40) through (43) are
Y + B and T, and the last term in Eq. (43) is differentiated assuming
C = u(Y + B) - X to be constant. The term is a symmetric n×n matrix
containing second order derivatives of the transformation function u(Y).

A comparison of Eqs. (40) with Eqs. (33) shows that the important
changes in the Newton equations due to the transformation (37) are in
Eq. (40a). The rest of Eqs. (40) is formally identical to the
corresponding terms in Eqs. (33), if $F(X,\theta)$ is replaced by
$H(Y,\theta)$. In Eq. (40a) we see three other replacements: the estimated
variance-covariance matrix R is replaced by Q, the right-hand side -C is
replaced by $-\Delta$, and the term $(K^T F)_{XX}$ is replaced by $\Xi$.

The replacement of R by Q corresponds to an application of the
linearized variance propagation formula to the transformation (37).
The replacement of the right-hand sides is a linearized transformation
of the residuals C into the Y-space. If the transformation (37) is
linear, then only these two replacements occur. If, however, the
transformation is nonlinear, then the last term in Eq. (43) does not
vanish and, because it contains second order derivatives of u(Y), it can
be quite complicated. This complication can offset algorithmic advantages
gained by a simplification of other terms in the Newton equations.
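
The replacement of R by Q can be made concrete for a transformation from
cartesian to polar coordinates. The following sketch is an added
illustration: the function name `transformed_covariance` and the
numerical values are assumptions, and the code evaluates only the product
of the Jacobian of v, the matrix R, and the transposed Jacobian for a
single point.

```python
import math

def transformed_covariance(x, y, R):
    """Q = v_X R v_X^T for Y = v(X) = (r, phi) and X = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    # Jacobian v_X of the cartesian-to-polar transformation
    vX = [[x / r, y / r],
          [-y / r2, x / r2]]
    # vR = v_X * R
    vR = [[sum(vX[i][m] * R[m][j] for m in range(2)) for j in range(2)]
          for i in range(2)]
    # Q = (v_X * R) * v_X^T
    return [[sum(vR[i][m] * vX[j][m] for m in range(2)) for j in range(2)]
            for i in range(2)]

s2 = 0.25                      # hypothetical variance of x and of y
Q = transformed_covariance(3.0, 4.0, [[s2, 0.0], [0.0, s2]])
print(Q)
```

For uncorrelated observations of equal variance, the radius keeps the
variance of the cartesian coordinates while the variance of the angle is
divided by the squared radius. Q therefore changes from point to point
even when R is constant.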

Iteration  algorithms  and  formulas  for  the  variances  of  the  solution 
again  can  be  obtained  by  manipulations  of  the  Newton  equations.  Explicit 
formulas  are  given  in  the  Appendix.  We  notice  that  second  order  Newton- 
Raphson  algorithms  necessarily  contain  second  order  derivatives  of  the 
model  function  H  as  well  as  of  the  transformation  function  u(Y).  The 
coding of the second order derivatives can, of course, be avoided if
first order Gauss-Newton algorithms are used. However, variance
estimates of the solution can be calculated to a first order accuracy
only if all the second order derivatives are available.

The author has carried out numerical experiments to determine whether
a solution of Eqs. (39) instead of Eqs. (32) has algorithmic advantages.
The experiments were done with the utility programs described in
Reference 15. The programs permit one to carry out the calculations either
in terms of X or in terms of Y, and to use either Newton-Raphson or
Gauss-Newton algorithms. The experiments were inconclusive. In some
examples the algorithms converged better when the problem was formulated
in X; in other examples a formulation in Y = v(X) produced better
algorithms. However, the differences in performance were never
significant. This result is in strong contrast to similar experiments
involving transformations of parameters. In those experiments, a suitable
parameter transformation often had a dramatic effect on the performance of
the solution algorithm. Some examples are given in the next section.
Another possible benefit from nonlinear transformations of
observations could be a simpler problem formulation. The complexity of the
normal equations is thereby of secondary importance, if one uses an
available general utility program for their solution. However, the model
equations must be made available to the utility program, which means that
the equations must be programmed. Then one has the choice to program
either the function $F(X,\theta)$ with its first and second order
derivatives, or the two functions $H(Y,\theta)$ and u(Y) with their
derivatives. If the transformation is nonlinear, then normally the
programming of H and u will not be simpler than the programming of F. An
exception may be the situation where the same transformation u(Y) (e.g.,
polar-cartesian) is used for several problems with different model
functions $H(Y,\theta)$, so that u(Y) has to be programmed only once.

We may conclude that in general a transformation of observations
offers little or no advantage over a formulation of the model equations
in terms of the original observations. There are, however, other useful
applications of such transformations. First, a graphical display of
the results can be clearer in terms of Y than in terms of X. Second,
and more importantly, the transformations can be a convenient method to
derive a "falsified" problem that can be solved easily and that provides
initial approximations to the unknown least squares solution vectors.

One can falsify the problem, e.g., by using a nonlinear transformation
but linearizing its effects on the problem formulation. A simple and
effective falsification is to replace the problem (38) by


$$Y = v(X) , \tag{44a}$$

$$H(Y + b, t) = 0 , \tag{44b}$$

$$W^* = b^T \left[ u_Y^T(Y)\, R^{-1}\, u_Y(Y) \right] b = \min . \tag{44c}$$

The formulation is identical to the correct formulation (38) only if the
transformation is linear, but the normal equations for the false problem
(44) are simple:

$$Q^{-1} b - \left[ k^T H_Y(Y + b, t) \right]^T = 0 , \tag{45a}$$

$$k^T H_t(Y + b, t) = 0 , \tag{45b}$$

$$H(Y + b, t) = 0 , \tag{45c}$$

where

$$Q = \left[ u_Y(Y) \right]^{-1} R \left[ u_Y^T(Y) \right]^{-1} . \tag{46}$$


This  system  can  be  much  simpler  and  easier  to  solve  than  Eqs.  (32)  or 
the  equivalent  Eqs.  (39).  Its  solution  is,  however,  not  the  least 
squares  solution  but  an  approximate  solution  of  unknown  quality. 

Initial  approximations  to  the  solution  also  can  be  obtained  by 
other  falsifications  in  addition  to  the  one  described,  or  instead  of  it. 
Such  falsifications  are,  e.g.,  assumptions  that  certain  observations  are 
error  free,  that  some  correlations  are  zero,  that  some  model  parameters 
have  prescribed  values,  etc. 


IV.  EXAMPLES 

The  first  example  is  a  case  involving  transformation  between  polar 
and  cartesian  coordinates.  We  shall  compare  results  that  are  obtained 
using  the  approach  of  the  previous  section  with  results  that  are  obtained 
by following suggestions by other authors. In data processing literature
one finds different suggestions. The simplest one is to treat the problem
after transformation as if the transformed quantities were observed.
It is clear from the discussions in Section II that such an approach
does not produce the least squares solution, i.e., it does not minimize
the weighted residual norm $W = c^T R^{-1} c$, even if the transformation
is linear. The most sophisticated suggestion (see, e.g., References 1
and 10) is to apply the transformation (46) to R, i.e., to
solve the system (45). As we have seen in the previous section, this
approach yields the least squares solution only if the transformation
Y = v(X) is linear. The following example illustrates the practical
consequences of such a problem falsification.

Let the observations be distances $r_i$ and azimuth angles $\phi_i$, and let
the model equations represent a straight line in cartesian coordinates.
Then the model equations are, in terms of the original observations,

$$F(r,\phi;a,b) = \begin{pmatrix} r_1\sin\phi_1 - a - b\,r_1\cos\phi_1 \\ r_2\sin\phi_2 - a - b\,r_2\cos\phi_2 \\ \vdots \\ r_n\sin\phi_n - a - b\,r_n\cos\phi_n \end{pmatrix} = 0 . \tag{47}$$

The transformation of the observations into cartesian coordinates is

$$x_i = r_i\cos\phi_i , \qquad y_i = r_i\sin\phi_i , \qquad i = 1, 2, \ldots, n , \tag{48}$$

and the model equations are, in terms of the transformed observations,

$$H(x,y;a,b) = \begin{pmatrix} y_1 - a - b\,x_1 \\ y_2 - a - b\,x_2 \\ \vdots \\ y_n - a - b\,x_n \end{pmatrix} = 0 . \tag{49}$$

The Jacobian matrix of the transformation is

$$J = \begin{pmatrix} J_1 & & & 0 \\ & J_2 & & \\ & & \ddots & \\ 0 & & & J_n \end{pmatrix} , \tag{50}$$

where

$$J_i = \frac{\partial(x_i,y_i)}{\partial(r_i,\phi_i)} = \begin{pmatrix} \cos\phi_i & -r_i\sin\phi_i \\ \sin\phi_i & r_i\cos\phi_i \end{pmatrix} .$$

We assume for simplicity that all observations are independent
with estimated standard errors $e_{ri}$ and $e_{\phi i}$, respectively. Then the
estimated variance-covariance matrix R is the diagonal matrix

$$R = \mathrm{diag}\left( e_{r1}^2,\ e_{\phi 1}^2,\ e_{r2}^2,\ e_{\phi 2}^2,\ \ldots,\ e_{rn}^2,\ e_{\phi n}^2 \right) . \tag{51}$$

The transformed variance-covariance matrix Q is according to Eq. (46)
the block diagonal matrix

$$Q = J R J^T = \begin{pmatrix} Q_1 & & & 0 \\ & Q_2 & & \\ & & \ddots & \\ 0 & & & Q_n \end{pmatrix} , \tag{53}$$

where

$$Q_i = \begin{pmatrix} e_{ri}^2\cos^2\phi_i + e_{\phi i}^2 r_i^2\sin^2\phi_i & (e_{ri}^2 - e_{\phi i}^2 r_i^2)\sin\phi_i\cos\phi_i \\ (e_{ri}^2 - e_{\phi i}^2 r_i^2)\sin\phi_i\cos\phi_i & e_{ri}^2\sin^2\phi_i + e_{\phi i}^2 r_i^2\cos^2\phi_i \end{pmatrix} . \tag{54}$$


For a numerical example, we take the ten points listed in Table 1
as observations and assume that their standard errors are

$$e_{ri} = 0.048 , \qquad e_{\phi i} = 27.5^\circ , \qquad i = 1, 2, \ldots, n . \tag{55}$$
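The covariance transformation of Eqs. (53) and (54) can be checked numerically. The following Python sketch is an illustration only and is not taken from the programs used for this report; all function names are hypothetical. It assumes that the angular standard error of Eq. (55) must be converted to radians, forms $Q_i = J_i R_i J_i^T$ for the second point of Table 1, and compares the result with the closed form of Eq. (54).

```python
import math

def polar_to_cartesian(r, phi):
    """Transformation (48); phi in radians."""
    return r * math.cos(phi), r * math.sin(phi)

def jacobian(r, phi):
    """J_i = d(x_i, y_i) / d(r_i, phi_i)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -r * s],
            [s, r * c]]

def q_from_jacobian(r, phi, e_r, e_phi):
    """Q_i = J_i R_i J_i^T with R_i = diag(e_r^2, e_phi^2), as in Eq. (53)."""
    J = jacobian(r, phi)
    variances = [e_r ** 2, e_phi ** 2]
    return [[sum(J[i][k] * variances[k] * J[j][k] for k in range(2))
             for j in range(2)]
            for i in range(2)]

def q_closed_form(r, phi, e_r, e_phi):
    """Elements of Q_i written out as in Eq. (54)."""
    c, s = math.cos(phi), math.sin(phi)
    q11 = e_r ** 2 * c ** 2 + e_phi ** 2 * r ** 2 * s ** 2
    q12 = (e_r ** 2 - e_phi ** 2 * r ** 2) * s * c
    q22 = e_r ** 2 * s ** 2 + e_phi ** 2 * r ** 2 * c ** 2
    return [[q11, q12], [q12, q22]]

# Second point of Table 1 with the standard errors of Eq. (55).
# Assumption: angles are converted from degrees to radians.
r, phi = 1.342, math.radians(26.6)
e_r, e_phi = 0.048, math.radians(27.5)

x, y = polar_to_cartesian(r, phi)
Q_a = q_from_jacobian(r, phi, e_r, e_phi)
Q_b = q_closed_form(r, phi, e_r, e_phi)

print(f"x = {x:.3f}, y = {y:.3f}")
for row_a, row_b in zip(Q_a, Q_b):
    print(row_a, row_b)
```

Because the off-diagonal element of $Q_i$ is in general not zero, the transformed coordinates are correlated even though r and φ were assumed independent.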

We made three adjustments. First, the r,φ-data were used together
with the model Eqs. (47). In the second adjustment, the x,y-data were
used together with the model Eqs. (49) and the transformation function
(48) in a utility program based on the normal Eqs. (37). The results
of both adjustments were identical, as they should be, and they are
listed in Table 2.

TABLE 1. OBSERVATIONS φ AND r AND CORRESPONDING
CARTESIAN COORDINATES

      φ         r         x        y

   206.6°     0.559     -0.50    -0.25
    26.6°     1.342      1.20     0.60
    26.6°     2.236      2.00     1.00
    26.6°     3.354      3.00     1.50
    26.6°     4.472      4.00     2.00
   123.7°     1.803     -1.00     1.50
    92.9°     1.952     -0.10     1.95
    68.2°     2.693      1.00     2.50
    52.4°     4.100      2.50     3.25
    42.0°     6.727      5.00     4.50

TABLE 2. ADJUSTMENT RESULTS

Case 1 and 2 (Original and Transformed Problem)

   a = 0.381 ± 0.298
   b = 1.141 ± 0.744       c_ab = 0.015065
   m_o = 1.24541

Case 3 (Falsified Problem)

   a = 0.680 ± 0.407
   b = 1.837 ± 0.259       c_ab = -0.568659
   m_o = 1.75646

The standard error of weight one, m_o, is not included in the standard
errors of the parameters.

The listed standard errors of the parameters are the
square roots of the diagonal elements of $V_t$, computed with formula (36).
The correlation coefficient $c_{ab}$ is the off-diagonal element of the
correlation matrix $C_t$, defined by

$$C_t = D_t^{-1/2}\, V_t\, D_t^{-1/2} , \tag{56}$$

where $D_t$ is the diagonal matrix of $V_t$. The standard error of weight
one is defined by

$$m_o = \left[ \frac{c^T R^{-1} c}{n-p} \right]^{1/2} = \left[ \frac{W}{n-p} \right]^{1/2} . \tag{57}$$
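Equations (56) and (57) are simple to evaluate once the residual vector and the parameter covariance matrix are available. The Python sketch below is a hedged illustration: the function names and all numbers are hypothetical and were chosen only so that the results can be checked by hand. The values of n and p are passed in as given, since their definitions precede this section.

```python
import math

def correlation_matrix(V):
    """C = D^(-1/2) V D^(-1/2), where D holds the diagonal of V (Eq. 56)."""
    d = [math.sqrt(V[i][i]) for i in range(len(V))]
    return [[V[i][j] / (d[i] * d[j]) for j in range(len(V))]
            for i in range(len(V))]

def standard_error_of_weight_one(c, R_inv, n, p):
    """m_o = sqrt(c^T R^-1 c / (n - p)) (Eq. 57)."""
    W = sum(c[i] * R_inv[i][j] * c[j]
            for i in range(len(c)) for j in range(len(c)))
    return math.sqrt(W / (n - p))

# Hypothetical 2x2 parameter covariance matrix.
V_t = [[4.0, 1.5],
       [1.5, 9.0]]
C_t = correlation_matrix(V_t)

# Hypothetical residuals with a diagonal R = diag(0.01, 0.04, 0.09),
# so that each residual contributes exactly one unit to W.
c = [0.1, -0.2, 0.3]
R_inv = [[1 / 0.01, 0.0, 0.0],
         [0.0, 1 / 0.04, 0.0],
         [0.0, 0.0, 1 / 0.09]]
m_o = standard_error_of_weight_one(c, R_inv, n=5, p=2)

print("c_ab =", C_t[0][1])
print("m_o  =", m_o)
```

In this constructed case W is 3 and n - p is 3, so m_o is one up to rounding, which is the value expected when the residuals are consistent with the assumed standard errors.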


Figure 1a shows the result of the adjustment in the φ,r-plane,
i.e., the plane of the original observations. The accuracies of
the observations are indicated by error ellipses around the observed
points. The adjustment is indicated by connecting the observed points
with the corresponding corrected locations on the fitted curve. The
figure shows that all adjustments are in the direction of largest
uncertainties.

Figure 1b shows the same result in the x,y-plane. The accuracies
of the transformed observations are again indicated by error ellipses,
corresponding to the transformed variance-covariance matrices $Q_i$. In
this presentation the adjustments seem to be in directions other than
those with largest uncertainties. This is typical for nonlinear
transformations of observations. The object of the fitting is to minimize
residuals of the original observations. The presentation in the x,y-plane
is distorted by the nonlinearity of the transformation.

In a third adjustment we used the x,y-data, the model Eqs. (49), and
the variance-covariance matrix Q, defined by Eq. (53). This treatment,
suggested by Deming (Reference 1) and other authors, was described in
Section III, Eqs. (44) through (46), as a falsification of the problem.
The numerical results of this adjustment are listed in Table 2. They
differ from the previous results, and the increase of m_o indicates that
the solution is not optimal. We notice also that the correlation
coefficient c_ab has changed its magnitude and sign.

Figure 2b shows the results of the adjustment in the x,y-plane.
It indicates that the adjustment would indeed be optimal if x,y were
the observations and Q were their variance-covariance matrix. However,
when the same results are plotted in the φ,r-plane, Figure 2a, it
becomes obvious that the adjustment has not achieved the goal of minimizing
the residuals of the original observations φ,r. The treatment of
transformations of observations in this form is a falsification of the
problem. The results are approximations to the least squares solution,
but since the quality of the approximations is not known, they may be
useful only as initial approximations for a least squares algorithm.
However, in a case like this example, an initial approximation could be
obtained more simply, e.g., graphically by drawing a straight line in the
x,y-plane through the observations.

Figure 1a. Adjustment in φ,r-space.
The data are shown with their one standard error ellipses and the
adjusted curve is shown with one standard error confidence limits.
The same results are shown by Figure 1b in the cartesian x,y-plane.

Figure 1b. Adjustment in φ,r-space.
The transformed data are shown with their one standard error ellipses
and the adjusted line is shown with one standard error confidence
limits. The same results are shown by Figure 1a in the φ,r-plane of
observations.

Figure 2a. Falsified Adjustment in x,y-space.
The data are shown with their one standard error ellipses and the
adjusted curve is shown with one standard error confidence limits.
The same results are shown in Figure 2b in the cartesian x,y-plane.

Figure 2b. Falsified Adjustment in x,y-space.
The transformed data are shown with their one standard error ellipses
and the adjusted line is shown with one standard error confidence
limits. The same results are shown in Figure 2a in the φ,r-plane of
observations.

Next,  we  present  an  example  for  the  linearization  of  parameters. 
Let  the  model  equation  be 


$$y - A x^B \exp\left( \frac{C}{x} \right) = 0 , \tag{58}$$

where x and y are observations and A, B, and C are model parameters.
An equivalent model formulation is

$$\ln y - a - b \ln x - \frac{c}{x} = 0 . \tag{59}$$

In Eq. (59) the parameters a, b, and c enter linearly. One can expect
a much better performance of solution algorithms if Eq. (59) is used.
The parameter transformation is in this example

$$A = e^a , \qquad B = b , \qquad C = c , \tag{60}$$

and the Jacobian matrix, needed in Eq. (23), is

$$\frac{\partial(A,B,C)}{\partial(a,b,c)} = \begin{pmatrix} e^a & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{61}$$
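The equivalence of Eqs. (58) and (59) and the Jacobian of the parameter transformation can be verified with a few lines of code. The following Python sketch is illustrative only; the function names and the parameter values are hypothetical. It assumes x > 0, y > 0, and A > 0, so that the logarithms exist.

```python
import math

def residual_58(x, y, A, B, C):
    """Left side of Eq. (58): y - A x^B exp(C/x)."""
    return y - A * x ** B * math.exp(C / x)

def residual_59(x, y, a, b, c):
    """Left side of Eq. (59): ln y - a - b ln x - c/x."""
    return math.log(y) - a - b * math.log(x) - c / x

def to_linear_parameters(A, B, C):
    """Inverse of Eq. (60): a = ln A, b = B, c = C (requires A > 0)."""
    return math.log(A), B, C

def jacobian_61(a):
    """d(A,B,C)/d(a,b,c) for A = exp(a), B = b, C = c."""
    return [[math.exp(a), 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# Hypothetical parameter values and abscissas, for illustration only.
A, B, C = 2.0, 1.5, 0.5
a, b, c = to_linear_parameters(A, B, C)

residuals = []
for x in (0.5, 1.0, 3.0):
    y = A * x ** B * math.exp(C / x)   # a point exactly on the model curve
    residuals.append((residual_58(x, y, A, B, C), residual_59(x, y, a, b, c)))

J = jacobian_61(a)
print(residuals)
print(J)
```

Points that satisfy Eq. (58) also satisfy Eq. (59) up to rounding, while the unknowns a, b, c appear only linearly in the second form.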


Another example is the trigonometric model

$$y - A \cos\left( \frac{x-B}{C} \right) = 0 . \tag{62}$$

An equivalent model is

$$y - a \sin(cx) - b \cos(cx) = 0 . \tag{63}$$

The corresponding parameter transformation is

$$a = A \sin(B/C) , \qquad b = A \cos(B/C) , \qquad c = 1/C , \tag{64}$$

with the Jacobian matrix

$$\frac{\partial(A,B,C)}{\partial(a,b,c)} = \left[ \frac{\partial(a,b,c)}{\partial(A,B,C)} \right]^{-1} = \begin{pmatrix} \sin(B/C) & (A/C)\cos(B/C) & -(AB/C^2)\cos(B/C) \\ \cos(B/C) & -(A/C)\sin(B/C) & (AB/C^2)\sin(B/C) \\ 0 & 0 & -1/C^2 \end{pmatrix}^{-1} . \tag{65}$$

In this example, the model (63) is linear with respect to only two of
its parameters. Nevertheless, the numerical treatment of the problem
differs dramatically depending on whether Eq. (62) or Eq. (63) is used.
In numerical experiments with Eq. (62) we found that, in order to achieve
convergence, one had to start with parameter values A, B, C that were very
close to their least squares values. Using the parameters a, b, c and the
model Eq. (63), one achieves fast convergence, e.g., with the initial values
a = b = 0.
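The parameter transformation (64) and the matrix that is inverted in Eq. (65) can be checked numerically. The Python sketch below is an illustration under stated assumptions: the function names, the parameter values, and the finite-difference step are hypothetical choices. It confirms that the models (62) and (63) give the same values and compares the analytic partial derivatives with central differences.

```python
import math

def to_abc(A, B, C):
    """Parameter transformation (64)."""
    return A * math.sin(B / C), A * math.cos(B / C), 1.0 / C

def model_62(x, A, B, C):
    """y predicted by Eq. (62)."""
    return A * math.cos((x - B) / C)

def model_63(x, a, b, c):
    """y predicted by Eq. (63)."""
    return a * math.sin(c * x) + b * math.cos(c * x)

def jacobian_abc(A, B, C):
    """Analytic d(a,b,c)/d(A,B,C), the matrix inverted in Eq. (65)."""
    s, co = math.sin(B / C), math.cos(B / C)
    return [[s, (A / C) * co, -(A * B / C ** 2) * co],
            [co, -(A / C) * s, (A * B / C ** 2) * s],
            [0.0, 0.0, -1.0 / C ** 2]]

def jacobian_fd(A, B, C, h=1e-6):
    """Central-difference approximation of the same matrix."""
    p = [A, B, C]
    columns = []
    for k in range(3):
        upper = p[:]
        lower = p[:]
        upper[k] += h
        lower[k] -= h
        f_up, f_lo = to_abc(*upper), to_abc(*lower)
        columns.append([(f_up[i] - f_lo[i]) / (2 * h) for i in range(3)])
    # columns[k][i] is d(output i)/d(parameter k); return rows = outputs.
    return [[columns[k][i] for k in range(3)] for i in range(3)]

# Hypothetical parameter values, for illustration only.
A, B, C = 1.7, 0.4, 2.5
a, b, c = to_abc(A, B, C)

max_model_diff = max(abs(model_62(x, A, B, C) - model_63(x, a, b, c))
                     for x in (-2.0, 0.0, 1.0, 5.0))
J_an = jacobian_abc(A, B, C)
J_fd = jacobian_fd(A, B, C)
max_jac_diff = max(abs(J_an[i][j] - J_fd[i][j])
                   for i in range(3) for j in range(3))

print("largest model difference:   ", max_model_diff)
print("largest Jacobian difference:", max_jac_diff)
```

For a fixed value of c, Eq. (63) is linear in a and b, which is consistent with the observation that the initial values a = b = 0 are adequate.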


V.  SUMMARY  AND  CONCLUSIONS 

Manipulations of model equations that produce simpler but equivalent
equations can greatly facilitate the preparation of least squares
problems (e.g., computer programming) for utility routines. The
manipulations can also improve the performance of numerical algorithms. If
the manipulations are merely algebraic and/or involve nonlinear
transformations of the model parameters, then their application is
straightforward and their implementation simple. If, however, the manipulations
include transformations of observations, then one also has to transform
the normal equations correspondingly. Neglect of this transformation
falsifies the problem and produces results that are of unknown quality
and no more reliable than, e.g., a graphical construction of a fitting
curve. A correct implementation of transformations of observations
requires the programming of the transformation function, including its
first and second order derivatives. It also does not improve the
performance of algorithms. Therefore, in most cases, it is more efficient
to formulate the model equations in terms of the original observations,
thereby avoiding the programming of the transformation function.

The need for second order derivatives of the model equations has
often been overlooked. In order to avoid the programming of these
derivatives, most authors suggest using a first order Gauss-Newton
algorithm for the solution of the normal equations, instead of a
second-order Newton-Raphson algorithm. The performance of the former is
often comparable to that of the latter because, even with more iterations,
the computing effort can be less due to the simpler equations. Second-order
derivatives of the model equations (and of the transformation function)
are, however, needed to compute the linear terms in formulas for variance
estimates of the results. Their neglect cannot be justified cursorily
by the argument that linearized model equations are already second order
accurate and, therefore, their second order derivatives are not needed.
It can be shown that the linearized normal equations do contain these
derivatives and that, therefore, the derivatives are needed in the
linearized variance propagation formula. Formulas for variance estimates
that do not contain second order derivatives are less than first order
accurate.

APPENDIX

DERIVATION OF ITERATION FORMULAS

We provide a set of iteration formulas that are derived from
the Newton equation (34) by algebraic manipulations. First, we define
the following matrices:

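$$G = \left( F_X R F_X^T \right)^{-1} \tag{A-1}$$

$$A = R F_X^T G F_X - I \tag{A-2}$$

$$\Gamma = \left[ I + A R\, (K^T F)_{XX} \right]^{-1} \tag{A-3}$$

$$E_0 = \Gamma \left[ A C - R F_X^T G F \right] \tag{A-4}$$

$$E_1 = \Gamma \left[ R F_X^T G F_t + A R\, (K^T F)_{Xt} \right] \tag{A-5}$$

$$D_0 = (K^T F)_{tX} - F_t^T G F_X R\, (K^T F)_{XX} \tag{A-6}$$

$$D_1 = (K^T F)_{tt} - F_t^T G F_X R\, (K^T F)_{Xt} \tag{A-7}$$

$$N = F_t^T G F_t - D_1 + D_0 E_1 \tag{A-8}$$

The iteration equations are

$$N \tau = F_t^T G\, (F_X C - F) + D_0 E_0 \tag{A-9}$$

$$K + \kappa = G\, (F_X C - F) - G \left[ F_t + F_X R\, (K^T F)_{Xt} \right] \tau - G F_X R\, (K^T F)_{XX}\, \varepsilon \tag{A-10}$$

$$\varepsilon = E_0 - E_1 \tau . \tag{A-11}$$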

Numerical experiments have shown that the convergence of the
iteration is enhanced if the equations are used in a subiteration
mode by iterating alternately on the parameters and residuals,
respectively. For parameter subiteration only equations (A-9) and
(A-10) are used, assuming ε = 0. For residual subiteration one sets
τ = 0 and uses equations (A-10) and (A-11).

In the variance formula (36) one uses N, defined by equation
(A-8), and

$$S = F_t^T G F_X + D_0 \Gamma A . \tag{A-12}$$

Another equivalent set of Newton-Raphson iteration equations is
given in Reference 13. Neither set is numerically superior to
the other, and both require subiterations of parameters and residuals
for efficiency.

Gauss-Newton iteration equations can be obtained from Newton-Raphson
iteration equations by setting all second order derivatives
equal to zero. Gauss-Newton algorithms converge more slowly,
but in some applications they have a larger domain of convergence.
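The relation between the two algorithms can be illustrated on a much simpler problem than the one treated in this report. The Python sketch below is a hedged illustration and not the iteration (A-9) through (A-11): it assumes an ordinary least squares problem with errors only in y, a single parameter t, the hypothetical model y = exp(t x), and error-free hypothetical data. The Gauss-Newton iteration is obtained from the Newton-Raphson iteration by dropping the term that contains the second derivative of the model.

```python
import math

def fit_exponential(xs, ys, t0, use_second_derivatives,
                    tol=1e-12, max_iter=50):
    """Minimize 0.5 * sum (y - exp(t x))^2 over the single parameter t.

    With use_second_derivatives=True the full Newton-Raphson second
    derivative of the objective is used. With False, the term that
    contains the second derivative of the model is set to zero, which
    gives the Gauss-Newton iteration.
    Returns (t, number_of_iterations).
    """
    t = t0
    for iteration in range(1, max_iter + 1):
        grad = 0.0
        hess = 0.0
        for x, y in zip(xs, ys):
            f = math.exp(t * x)
            r = y - f            # residual
            f_t = x * f          # first derivative of the model
            f_tt = x * x * f     # second derivative of the model
            grad += -r * f_t
            hess += f_t * f_t
            if use_second_derivatives:
                hess += -r * f_tt
        step = -grad / hess
        t += step
        if abs(step) < tol:
            return t, iteration
    return t, max_iter

# Hypothetical error-free data generated with t = 0.5.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.5 * x) for x in xs]

t_nr, n_nr = fit_exponential(xs, ys, 0.3, use_second_derivatives=True)
t_gn, n_gn = fit_exponential(xs, ys, 0.3, use_second_derivatives=False)

print("Newton-Raphson:", t_nr, "after", n_nr, "iterations")
print("Gauss-Newton:  ", t_gn, "after", n_gn, "iterations")
```

Both iterations reach the same parameter value here. Iteration counts on such a small zero-residual problem say little about the behavior on the constrained problems treated in this report.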

Iteration equations for least squares problems with transformations
of observations can be obtained from the formulas in this Appendix
by substituting

Q for R,
A for C,

and

E for $(K^T F)_{XX}$.

Expressions for Q, A, and E in terms of the model and the transformation
functions are given in Section III, equations (41), (42), and (43).